Skip to content

Implement a new Interface function for datastore_before_update#275

Merged
duttonw merged 13 commits into
ckan:masterfrom
okfn:datastore_before_update
Apr 23, 2026
Merged

Implement a new Interface function for datastore_before_update#275
duttonw merged 13 commits into
ckan:masterfrom
okfn:datastore_before_update

Conversation

@avdata99

@avdata99 avdata99 commented Apr 15, 2026

Copy link
Copy Markdown
Member

Adding the datastore_before_update hook to IXloader
I need to hook into the datastore updates to detect field changes

This PR adds a new optional method to the IXloader interface so plugins can observe schema changes before XLoader modifies the DataStore table.

The datastore_before_update call includes:

  • existing_fields: list of current DataStore field dicts (each has id, type, and optionally info), or None if the table does not yet exist. The internal _id column is stripped.

  • new_headers: list of field dicts about to be written.

  • tests added. Ran here because of this repo permission

Comment thread ckanext/xloader/loader.py Outdated
'''
fields_match = _fields_match(fields, existing_fields, logger)
if fields_match == FieldMatch.EXACT_MATCH:
_notify_datastore_before_update(resource_id, existing_info, fields)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 33 uses kwargs, this uses args. Should be consistent.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated here okfn@8b2a706 @ThrawnCA

@ThrawnCA ThrawnCA left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, thanks for adding it!

Just need to make the calls to the new function use a consistent argument style.

Comment thread ckanext/xloader/interfaces.py
Comment thread ckanext/xloader/tests/test_loader.py Outdated
mimetype="text/csv",
logger=logger,
)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why two blank lines?

@avdata99 avdata99 Apr 22, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One line removed here
b6a8b2e

Comment thread ckanext/xloader/interfaces.py Outdated

.. warning::

The ``existing_fields`` and ``new_headers`` lists are the

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the existing_fields object is the same after removing _id.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated here 6e13330

@ThrawnCA ThrawnCA left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duttonw @wardi Thoughts?

@wardi

wardi commented Apr 22, 2026

Copy link
Copy Markdown
Contributor

IIUC this is only useful for logging, but it looks fine to me.

@ThrawnCA

Copy link
Copy Markdown
Collaborator

IIUC this is only useful for logging, but it looks fine to me.

Theoretically a plugin could do things like overriding the data dictionary for specific fields when a resource is first created, instead of updating it later. Or eg allowing numeric fields while preventing any field from being treated as a timestamp.

@duttonw

duttonw commented Apr 22, 2026

Copy link
Copy Markdown
Collaborator

Hi @avdata99 ,

Overall the logic you wish to add looks good and should also allow other plugins to 'block' table changes if the wrong file is uploaded with column's changed, i.e. structured data used in api/rest calls or join's and should not be altered.

Do ensure your new unit tests to also be passing. (It may be best to use a different branch for quick iterations and then squash the commits then merge them onto this branch for cleaner history).
It seems it was passing on the 16th but your code changes after that made it fail. https://github.com/avdata99/ckanext-xloader/actions/workflows/test.yml?query=branch%3Adatastore_before_update++

https://github.com/avdata99/ckanext-xloader/actions/runs/24799367527/attempts/1#summary-72577605178 (4 tests failing)

Also, @wardi it looks like we will all need to put some time into updating xloader as we now have 44 baseline tests having failed against latest ckan/master branch. :'(
============= 48 failed, 84 passed, 5 skipped in 105.31s (0:01:45) ============= https://github.com/avdata99/ckanext-xloader/actions/runs/24799367527/attempts/1#summary-72577605148

@avdata99

avdata99 commented Apr 23, 2026

Copy link
Copy Markdown
Member Author

Good catch @duttonw. Tests were updated here 94d7b68

Now they're green https://github.com/avdata99/ckanext-xloader/actions/runs/24833308497

@duttonw duttonw merged commit 8d787c3 into ckan:master Apr 23, 2026
4 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants